Estimation of Speaker’s Height and Speech Sign

Author

  • Sorin Dusan
Abstract

Estimating a speaker’s height and vocal tract length (VTL) from the speech signal has applications in forensics and automatic speech recognition. It has long been suggested that a speaker’s VTL correlates with the speaker’s height, on the one hand, and with formant frequencies, on the other. Until recently, these putative relationships had been examined empirically in studies employing relatively small numbers of speakers. Scattered studies have reported intriguing results on the correlations between a speaker’s height and various acoustic speech parameters. Owing to a lack of databases, few studies have presented extensive comparisons between a speaker’s actual VTL and the VTL estimated from the speech signal. This paper presents an analysis of correlations between various acoustic speech parameters and speaker’s height for a large number of speakers. It also presents a new method for optimal estimation of a speaker’s height and VTL from various acoustic speech parameters.

Similar resources

Efficient Pitch-based Estimation o

To reduce inter-speaker variability, vocal tract length normalization (VTLN) is commonly used to transform acoustic features for automatic speech recognition (ASR). The warp factors used in this process are usually derived by maximum likelihood (ML) estimation, involving an exhaustive search over possible values. We describe an alternative approach: exploit the correlation between a speaker’s a...

Speaker adaptation in noisy environments based on parameter estimation using uncertain data

This paper describes a new method for the speaker adaptation of HMM parameters in environments with background noise. This method is based on Bayesian estimation, and calculates the a posteriori distribution of clean-speech HMM parameters from their a priori distribution by using noisy speech observations. The advantage of the method is that the distribution of the noise can be taken into account ...

Estimating the Stability and Dispersion of the Biometric Glottal Fingerprint in Continuous Speech

The speaker’s biometric voice fingerprint may be derived from voice as a whole, or from the vocal tract and glottal signals, after separation by inverse filtering. This last approach has been used by the authors in early work, where it has been shown that the biometric fingerprint obtained from the glottal source or related speech residuals gives a good description of the speaker’s identity and...

An Acoustic Study of Emotivity-Prosody Interface in Persian Speech Using the Tilt Model

This paper aims to explore some acoustic properties (i.e. duration and pitch amplitude of speech) associated with three different emotions: anger, sadness and joy against neutrality as a reference point, all being intentionally expressed by six Persian speakers. The primary purpose of this study is to find out if there is any correspondence between the given emotions and prosody patterning in P...

Statistical Study of Speaker’s Peculiarities of Utterances into Phrases Segmentation

The report is concerned with the experimental study of the idiosyncrasy of utterance-into-phrase segmentation observed in the speech of a popular Russian TV-anchorman and two TV-news readers. Comparative statistical estimation of relative frequencies of occurrence of pauses of various duration, frequencies of occurrence of phrases and pairs of phrases with a different number of accent units wer...

Publication date: 2005